In recent years, cameras such as wearable cameras and action cameras have come into wide use in fields such as sports. Such cameras often capture images continuously for a long time, and the composition easily becomes monotonous; as a result, the captured images (pictures, videos, or the like) can be difficult to enjoy in their original state. Accordingly, a technology is desired for generating a summary image that condenses the captured images to their interesting points.
As examples of such a technology, techniques for switching images in time with background music (BGM) have been developed, as disclosed in Patent Literatures 1 and 2 below. More specifically, Patent Literature 1 discloses a technique for switching image data at each phrase division timing of the music or once every several phrase divisions.